This is intended to help get people started with initial modeling in R, by going through the main steps:
This code uses ggplot2 for plotting, lubridate for some time handling, caret as a workhorse modeling library, and car for power transformations.
A handful of utility functions are also useful.
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
## Loading required package: lattice
The data is in three files. The macro(economic) data is to be added to both the test and the train records by joining on the timestamp. It makes more sense to clean up the macro records before joining, however, so we will start with that.
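As a minimal sketch of the join on the timestamp, assuming small stand-in data frames rather than the real files (`usdrub` and `price_doc` are column names that appear in this data; `id` and all the values are made up for illustration):

```r
# Toy stand-ins for the macro and property records; values are invented.
macro <- data.frame(
  timestamp = as.Date(c("2014-01-01", "2014-01-02", "2014-01-03")),
  usdrub    = c(32.7, NA, 33.1)
)
train <- data.frame(
  id        = 1:4,
  timestamp = as.Date(c("2014-01-01", "2014-01-03", "2014-01-03", "2014-01-05")),
  price_doc = c(5.1e6, 6.3e6, 4.8e6, 7.0e6)
)

# Left join: keep every property record, even when no macro row matches its date.
train_joined <- merge(train, macro, by = "timestamp", all.x = TRUE)

# merge() sorts by the key, so restore the original record order.
train_joined <- train_joined[order(train_joined$id), ]
```

The same call would be applied to the test records. Cleaning the macro table first means the fix is made once rather than once per joined property row.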
## Warning: Removed 90 rows containing missing values (geom_path).
## (83 further warnings of the same form omitted; the number of rows removed ranges from 3 to 1754.)
Clean up mortgage_value to create a new monotonic column, mortgage_value_montonic.
Drop a handful of columns with a very small number of values that look uninteresting.
The rent_price_Nroom_eco columns also look like they have a suspicious value or two in them. Set those to NA so they can be imputed with something more likely.
Some modeling algorithms deal poorly with missing values, so impute values where needed. Don’t want to lose the NA-ness entirely however, so add a ‘
## [1] " add_isna_col processing gdp_quart"
## [1] " add_isna_col processing gdp_quart_growth"
## [1] " add_isna_col processing cpi"
## [1] " add_isna_col processing ppi"
## [1] " add_isna_col processing gdp_deflator"
## [1] " add_isna_col processing balance_trade"
## [1] " add_isna_col processing balance_trade_growth"
## [1] " add_isna_col processing usdrub"
## [1] " add_isna_col processing eurrub"
## [1] " add_isna_col processing brent"
## [1] " add_isna_col processing net_capital_export"
## [1] " add_isna_col processing average_provision_of_build_contract_moscow"
## [1] " add_isna_col processing rts"
## [1] " add_isna_col processing micex"
## [1] " add_isna_col processing micex_rgbi_tr"
## [1] " add_isna_col processing micex_cbi_tr"
## [1] " add_isna_col processing deposits_growth"
## [1] " add_isna_col processing deposits_rate"
## [1] " add_isna_col processing mortgage_growth"
## [1] " add_isna_col processing grp"
## [1] " add_isna_col processing grp_growth"
## [1] " add_isna_col processing income_per_cap"
## [1] " add_isna_col processing real_dispos_income_per_cap_growth"
## [1] " add_isna_col processing salary"
## [1] " add_isna_col processing salary_growth"
## [1] " add_isna_col processing retail_trade_turnover"
## [1] " add_isna_col processing retail_trade_turnover_per_cap"
## [1] " add_isna_col processing retail_trade_turnover_growth"
## [1] " add_isna_col processing labor_force"
## [1] " add_isna_col processing unemployment"
## [1] " add_isna_col processing employment"
## [1] " add_isna_col processing invest_fixed_capital_per_cap"
## [1] " add_isna_col processing invest_fixed_assets"
## [1] " add_isna_col processing profitable_enterpr_share"
## [1] " add_isna_col processing unprofitable_enterpr_share"
## [1] " add_isna_col processing share_own_revenues"
## [1] " add_isna_col processing overdue_wages_per_cap"
## [1] " add_isna_col processing fin_res_per_cap"
## [1] " add_isna_col processing marriages_per_1000_cap"
## [1] " add_isna_col processing divorce_rate"
## [1] " add_isna_col processing construction_value"
## [1] " add_isna_col processing invest_fixed_assets_phys"
## [1] " add_isna_col processing pop_natural_increase"
## [1] " add_isna_col processing pop_migration"
## [1] " add_isna_col processing pop_total_inc"
## [1] " add_isna_col processing childbirth"
## [1] " add_isna_col processing mortality"
## [1] " add_isna_col processing housing_fund_sqm"
## [1] " add_isna_col processing lodging_sqm_per_cap"
## [1] " add_isna_col processing sewerage_share"
## [1] " add_isna_col processing gas_share"
## [1] " add_isna_col processing electric_stove_share"
## [1] " add_isna_col processing average_life_exp"
## [1] " add_isna_col processing infant_mortarity_per_1000_cap"
## [1] " add_isna_col processing perinatal_mort_per_1000_cap"
## [1] " add_isna_col processing incidence_population"
## [1] " add_isna_col processing rent_price_4.room_bus"
## [1] " add_isna_col processing rent_price_3room_bus"
## [1] " add_isna_col processing rent_price_2room_bus"
## [1] " add_isna_col processing rent_price_1room_bus"
## [1] " add_isna_col processing rent_price_3room_eco"
## [1] " add_isna_col processing rent_price_2room_eco"
## [1] " add_isna_col processing rent_price_1room_eco"
## [1] " add_isna_col processing load_of_teachers_preschool_per_teacher"
## [1] " add_isna_col processing child_on_acc_pre_school"
## [1] " add_isna_col processing load_of_teachers_school_per_teacher"
## [1] " add_isna_col processing students_state_oneshift"
## [1] " add_isna_col processing provision_doctors"
## [1] " add_isna_col processing provision_nurse"
## [1] " add_isna_col processing load_on_doctors"
## [1] " add_isna_col processing power_clinics"
## [1] " add_isna_col processing hospital_beds_available_per_cap"
## [1] " add_isna_col processing hospital_bed_occupancy_per_year"
## [1] " add_isna_col processing turnover_catering_per_cap"
## [1] " add_isna_col processing theaters_viewers_per_1000_cap"
## [1] " add_isna_col processing seats_theather_rfmin_per_100000_cap"
## [1] " add_isna_col processing museum_visitis_per_100_cap"
## [1] " add_isna_col processing bandwidth_sports"
## [1] " add_isna_col processing population_reg_sports_share"
## [1] " add_isna_col processing students_reg_sports_share"
## [1] " add_isna_col processing apartment_build"
## [1] " add_isna_col processing apartment_fund_sqm"
## [1] "after imputation, have 37507 NAs in macro data"
## [1] "fixing column mortgage_value with power 0.358032"
## [1] "fixing column mortgage_value_montonic with power 0.399137"
Load the property-specific data, and fix column types. It’s actually easiest to do this by combining test and train into one data frame, and cleaning that.
Break overall_train into three sets (train, validate and test) in a 60:20:20 split.
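One way to get a 60:20:20 split in base R is to shuffle the row indices once and cut them into three blocks. This is a sketch on a toy stand-in for overall_train; the seed, the `id` column and the names `train_set`, `validate_set` and `test_set` are assumptions for illustration.

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible

# Toy stand-in for overall_train
overall_train <- data.frame(id = 1:1000, price_doc = rlnorm(1000, meanlog = 15))

n       <- nrow(overall_train)
n_train <- floor(0.6 * n)
n_val   <- floor(0.2 * n)

# Shuffle once, then slice: first 60% train, next 20% validate, rest test
shuffled  <- sample(n)
train_idx <- shuffled[seq_len(n_train)]
val_idx   <- shuffled[n_train + seq_len(n_val)]
test_idx  <- shuffled[-seq_len(n_train + n_val)]

train_set    <- overall_train[train_idx, ]
validate_set <- overall_train[val_idx, ]
test_set     <- overall_train[test_idx, ]
```

Since caret is already loaded, `createDataPartition()` is an alternative when the split should be stratified on the outcome.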
Having life_sq < 5 or full_sq < 5 is probably an error. Set those to NA to impute later. There are also rows with full_sq < life_sq, usually by a lot. Guess that this is a coding error and that full_sq was accidentally recorded as just the extra area.
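A sketch of these two area fixes on toy values. The repair for full_sq < life_sq (adding life_sq back onto full_sq) is only one possible reading of the guess above, not necessarily what was done here.

```r
# Toy stand-in for overall_data; values are invented
overall_data <- data.frame(
  full_sq = c(50, 3, 10, 40, NA),
  life_sq = c(30, 20, 45, 2, 25)
)

# Areas under 5 are treated as errors and blanked for later imputation.
# which() drops NA comparisons, so rows that are already NA are left alone.
overall_data$full_sq[which(overall_data$full_sq < 5)] <- NA
overall_data$life_sq[which(overall_data$life_sq < 5)] <- NA

# ASSUMPTION: where full_sq < life_sq, full_sq holds only the extra area,
# so the total is full_sq + life_sq.
idx <- which(overall_data$full_sq < overall_data$life_sq)
overall_data$full_sq[idx] <- overall_data$full_sq[idx] + overall_data$life_sq[idx]
```

The small-value check runs first so that a full_sq of 3 is blanked rather than "repaired" into a plausible-looking total.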
## Warning: Transformation introduced infinite values in continuous x-axis
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 7559 rows containing missing values (geom_point).
Have max floor < floor in 2136 cases. Assume floor is more accurate than max floor.
qplot(floor, max_floor, data=overall_data)
## Warning: Removed 9572 rows containing missing values (geom_point).
idx <- with(overall_data, max_floor < floor)
overall_data$max_floor[idx] <- NA
Have only two points, one test and one train, of material==3. Drop.
qplot(material, data=overall_data)
idx <- with(overall_data, material == 3)
overall_data$material[idx] <- NA
Build year spans a huge range, from 0 to 20052009. Assume years before 1860 are bad, as is 4965. Convert 20052009 into 2007 as a guess.
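The build year rules can be sketched on a plain vector as follows. The 1860 cut-off and the 2007 replacement come from the text; the upper cut-off of 2019 is an assumption, chosen so that 2018 and 2019 survive and 4965 does not.

```r
# A few of the values seen in the data, plus two ordinary years and an NA
build_year <- c(0, 1, 215, 1691, 1975, 2014, 2018, 4965, 20052009, NA)

# Read the fused value 20052009 as "2005 to 2009" and take the midpoint
build_year[which(build_year == 20052009)] <- 2007

# Blank implausible years (upper bound 2019 is an assumption)
bad <- which(build_year < 1860 | build_year > 2019)
build_year[bad] <- NA
```

The fused value has to be handled before the range check, otherwise it would be blanked along with 4965.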
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 14654 rows containing non-finite values (stat_bin).
## subset(overall_data, build_year < 1860 | build_year > 2017)$build_year
##        0        1        2        3       20       71      215     1691
##      899      555        1        2        1        1        2        1
##     2018     2019     4965 20052009
##       31        5        1        1
Fewer than 1 room is probably incorrect.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 9572 rows containing non-finite values (stat_bin).
Some kitch_sq values look like build years. Use those to guess a build year. Many others are the same size as (or bigger than) life_sq or full_sq. Clear those.
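A sketch of both kitchen fixes on toy rows. The year range used to spot misplaced build years (1860 to 2019) and the choice to fill build_year only where it is missing are assumptions, not something stated here.

```r
# Toy stand-in for overall_data; values are invented
overall_data <- data.frame(
  full_sq    = c(60, 45, 38, 70),
  life_sq    = c(40, 30, 20, 50),
  kitch_sq   = c(9, 1970, 38, 2013),
  build_year = c(1985, NA, 2001, 2010)
)

# ASSUMPTION: a kitch_sq in a plausible year range is a misplaced build year
looks_like_year <- which(overall_data$kitch_sq >= 1860 & overall_data$kitch_sq <= 2019)

# Conservative choice: only fill build_year where it is currently missing
fill <- looks_like_year[is.na(overall_data$build_year[looks_like_year])]
overall_data$build_year[fill] <- overall_data$kitch_sq[fill]
overall_data$kitch_sq[looks_like_year] <- NA

# A kitchen at least as large as the living or total area is not credible
too_big <- which(overall_data$kitch_sq >= overall_data$life_sq |
                 overall_data$kitch_sq >= overall_data$full_sq)
overall_data$kitch_sq[too_big] <- NA
```

On these rows only the first kitchen area (9) survives, and the second row gains a build year of 1970.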
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 9572 rows containing non-finite values (stat_bin).
Add a value-is-missing column and remove near-zero variance columns, in prep for imputing values and transforming.
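The value-is-missing step could look roughly like the sketch below. `add_isna_flags` and the `_isna` suffix are hypothetical names for illustration; this is not the `add_isna_col` utility whose output follows.

```r
# Hypothetical helper: for every column that contains NAs,
# append a 0/1 indicator column named <column>_isna.
add_isna_flags <- function(df, suffix = "_isna") {
  for (col in names(df)) {
    missing <- is.na(df[[col]])
    if (any(missing)) {
      df[[paste0(col, suffix)]] <- as.integer(missing)
    }
  }
  df
}

# Toy data: two columns with NAs, one without
toy <- data.frame(full_sq = c(50, NA, 38), floor = c(3, 7, NA), id = 1:3)
flagged <- add_isna_flags(toy)
names(flagged)
```

Columns without any NAs get no indicator, which avoids creating constant columns. For the near-zero variance step, caret provides `nearZeroVar()`, which returns the positions of the columns it would drop.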
## [1] " add_isna_col processing full_sq"
## [1] " add_isna_col processing life_sq"
## [1] " add_isna_col processing floor"
## [1] " add_isna_col processing max_floor"
## [1] " add_isna_col processing material"
## Warning in tmp[isna_idx] <- na.value: number of items to replace is not a
## multiple of replacement length
## [1] " add_isna_col processing build_year"
## [1] " add_isna_col processing num_room"
## [1] " add_isna_col processing kitch_sq"
## [1] " add_isna_col processing state"
## Warning in tmp[isna_idx] <- na.value: number of items to replace is not a
## multiple of replacement length
## [1] " add_isna_col processing product_type"
## Warning in tmp[isna_idx] <- na.value: number of items to replace is not a
## multiple of replacement length
## [1] " add_isna_col processing preschool_quota"
## [1] " add_isna_col processing school_quota"
## [1] " add_isna_col processing hospital_beds_raion"
## [1] " add_isna_col processing raion_build_count_with_material_info"
## [1] " add_isna_col processing build_count_block"
## [1] " add_isna_col processing build_count_wood"
## [1] " add_isna_col processing build_count_frame"
## [1] " add_isna_col processing build_count_brick"
## [1] " add_isna_col processing build_count_monolith"
## [1] " add_isna_col processing build_count_panel"
## [1] " add_isna_col processing build_count_foam"
## [1] " add_isna_col processing build_count_slag"
## [1] " add_isna_col processing build_count_mix"
## [1] " add_isna_col processing raion_build_count_with_builddate_info"
## [1] " add_isna_col processing build_count_before_1920"
## [1] " add_isna_col processing build_count_1921.1945"
## [1] " add_isna_col processing build_count_1946.1970"
## [1] " add_isna_col processing build_count_1971.1995"
## [1] " add_isna_col processing build_count_after_1995"
## [1] " add_isna_col processing metro_min_walk"
## [1] " add_isna_col processing metro_km_walk"
## [1] " add_isna_col processing railroad_station_walk_km"
## [1] " add_isna_col processing railroad_station_walk_min"
## [1] " add_isna_col processing ID_railroad_station_walk"
## Warning in tmp[isna_idx] <- na.value: number of items to replace is not a
## multiple of replacement length
## [1] " add_isna_col processing cafe_sum_500_min_price_avg"
## [1] " add_isna_col processing cafe_sum_500_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_500"
## [1] " add_isna_col processing cafe_sum_1000_min_price_avg"
## [1] " add_isna_col processing cafe_sum_1000_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_1000"
## [1] " add_isna_col processing cafe_sum_1500_min_price_avg"
## [1] " add_isna_col processing cafe_sum_1500_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_1500"
## [1] " add_isna_col processing green_part_2000"
## [1] " add_isna_col processing cafe_sum_2000_min_price_avg"
## [1] " add_isna_col processing cafe_sum_2000_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_2000"
## [1] " add_isna_col processing cafe_sum_3000_min_price_avg"
## [1] " add_isna_col processing cafe_sum_3000_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_3000"
## [1] " add_isna_col processing prom_part_5000"
## [1] " add_isna_col processing cafe_sum_5000_min_price_avg"
## [1] " add_isna_col processing cafe_sum_5000_max_price_avg"
## [1] " add_isna_col processing cafe_avg_price_5000"
## [1] " add_isna_col processing price_doc"
Use caret::preProcess to impute.
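A minimal sketch of imputation with `caret::preProcess`, assuming median imputation. The method actually used here is not stated, and `"knnImpute"` or `"bagImpute"` are other options preProcess accepts. The toy data frame is invented.

```r
library(caret)

# Toy numeric data with one NA per column
toy <- data.frame(
  full_sq = c(50, NA, 38, 72, 45),
  floor   = c(3, 7, NA, 12, 5)
)

# Learn per-column medians, then apply them to fill the NAs
pp      <- preProcess(toy, method = "medianImpute")
imputed <- predict(pp, newdata = toy)
```

In a train/validate/test setup, the preProcess object would normally be fitted on the training rows only and then applied to the other sets with `predict()`, so the imputed values do not leak information from held-out data.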
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 3690557 197.1 6861544 366.5 5831138 311.5
## Vcells 192034895 1465.2 281230277 2145.7 281227208 2145.6
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 3690958 197.2 6861544 366.5 5831138 311.5
## Vcells 192035076 1465.2 281230277 2145.7 281227208 2145.6
## [0] train-rmse:14.819591
## [1] train-rmse:14.523707
## [2] train-rmse:14.233721
## [3] train-rmse:13.949535
## [4] train-rmse:13.671043
## [5] train-rmse:13.398122
## [6] train-rmse:13.130665
## [7] train-rmse:12.868619
## [8] train-rmse:12.611794
## [9] train-rmse:12.360188
## [10] train-rmse:12.113395
## [11] train-rmse:11.871696
## [12] train-rmse:11.634833
## [13] train-rmse:11.402686
## [14] train-rmse:11.175255
## [15] train-rmse:10.952374
## [16] train-rmse:10.734005
## [17] train-rmse:10.519939
## [18] train-rmse:10.310135
## [19] train-rmse:10.104543
## [20] train-rmse:9.903119
## [21] train-rmse:9.705624
## [22] train-rmse:9.512130
## [23] train-rmse:9.322547
## [24] train-rmse:9.136763
## [25] train-rmse:8.954733
## [26] train-rmse:8.776309
## [27] train-rmse:8.601361
## [28] train-rmse:8.429941
## [29] train-rmse:8.262036
## [30] train-rmse:8.097532
## [31] train-rmse:7.936285
## [32] train-rmse:7.778323
## [33] train-rmse:7.623520
## [34] train-rmse:7.471877
## [35] train-rmse:7.323229
## [36] train-rmse:7.177491
## [37] train-rmse:7.034723
## [38] train-rmse:6.894812
## [39] train-rmse:6.757720
## [40] train-rmse:6.623443
## [41] train-rmse:6.491847
## [42] train-rmse:6.362790
## [43] train-rmse:6.236374
## [44] train-rmse:6.112549
## [45] train-rmse:5.991149
## [46] train-rmse:5.872218
## [47] train-rmse:5.755701
## [48] train-rmse:5.641428
## [49] train-rmse:5.529570
## [50] train-rmse:5.419989
## [51] train-rmse:5.312630
## [52] train-rmse:5.207343
## [53] train-rmse:5.104250
## [54] train-rmse:5.003148
## [55] train-rmse:4.904226
## [56] train-rmse:4.807137
## [57] train-rmse:4.712048
## [58] train-rmse:4.618932
## [59] train-rmse:4.527616
## [60] train-rmse:4.438241
## [61] train-rmse:4.350654
## [62] train-rmse:4.264820
## [63] train-rmse:4.180638
## [64] train-rmse:4.098250
## [65] train-rmse:4.017524
## [66] train-rmse:3.938519
## [67] train-rmse:3.860981
## [68] train-rmse:3.785117
## [69] train-rmse:3.710754
## [70] train-rmse:3.637921
## [71] train-rmse:3.566610
## [72] train-rmse:3.496535
## [73] train-rmse:3.428023
## [74] train-rmse:3.360896
## [75] train-rmse:3.295150
## [76] train-rmse:3.230746
## [77] train-rmse:3.167717
## [78] train-rmse:3.105929
## [79] train-rmse:3.045414
## [80] train-rmse:2.986019
## [81] train-rmse:2.927836
## [82] train-rmse:2.870990
## [83] train-rmse:2.815302
## [84] train-rmse:2.760715
## [85] train-rmse:2.707295
## [86] train-rmse:2.654935
## [87] train-rmse:2.603567
## [88] train-rmse:2.553342
## [89] train-rmse:2.504112
## [90] train-rmse:2.455909
## [91] train-rmse:2.408812
## [92] train-rmse:2.362482
## [93] train-rmse:2.317208
## [94] train-rmse:2.272817
## [95] train-rmse:2.229383
## [96] train-rmse:2.186774
## [97] train-rmse:2.145234
## [98] train-rmse:2.104442
## [99] train-rmse:2.064558
## [100] train-rmse:2.025598
## [101] train-rmse:1.987280
## [102] train-rmse:1.949822
## [103] train-rmse:1.913244
## [104] train-rmse:1.877358
## [105] train-rmse:1.842331
## [106] train-rmse:1.807895
## [107] train-rmse:1.774220
## [108] train-rmse:1.741348
## [109] train-rmse:1.709069
## [110] train-rmse:1.677510
## [111] train-rmse:1.646662
## [112] train-rmse:1.616403
## [113] train-rmse:1.586979
## [114] train-rmse:1.558106
## [115] train-rmse:1.529873
## [116] train-rmse:1.502290
## [117] train-rmse:1.475201
## [118] train-rmse:1.448665
## [119] train-rmse:1.422664
## [120] train-rmse:1.397303
## [121] train-rmse:1.372517
## [122] train-rmse:1.348372
## [123] train-rmse:1.324711
## [124] train-rmse:1.301589
## [125] train-rmse:1.278908
## [126] train-rmse:1.256824
## [127] train-rmse:1.235214
## [128] train-rmse:1.214101
## [129] train-rmse:1.193490
## [130] train-rmse:1.173290
## [131] train-rmse:1.153598
## [132] train-rmse:1.134308
## [133] train-rmse:1.115433
## [134] train-rmse:1.097010
## [135] train-rmse:1.079059
## [136] train-rmse:1.061582
## [137] train-rmse:1.044466
## [138] train-rmse:1.027709
## [139] train-rmse:1.011398
## [140] train-rmse:0.995396
## [141] train-rmse:0.979839
## [142] train-rmse:0.964680
## [143] train-rmse:0.949779
## [144] train-rmse:0.935252
## [145] train-rmse:0.921164
## [146] train-rmse:0.907387
## [147] train-rmse:0.893918
## [148] train-rmse:0.880836
## [149] train-rmse:0.868088
## [150] train-rmse:0.855656
## [151] train-rmse:0.843537
## [152] train-rmse:0.831788
## [153] train-rmse:0.820301
## [154] train-rmse:0.809199
## [155] train-rmse:0.798297
## [156] train-rmse:0.787633
## [157] train-rmse:0.777259
## [158] train-rmse:0.767249
## [159] train-rmse:0.757443
## [160] train-rmse:0.747936
## [161] train-rmse:0.738651
## [162] train-rmse:0.729582
## [163] train-rmse:0.720798
## [164] train-rmse:0.712247
## [165] train-rmse:0.703933
## [166] train-rmse:0.695763
## [167] train-rmse:0.687947
## [168] train-rmse:0.680321
## [169] train-rmse:0.672951
## [170] train-rmse:0.665688
## [171] train-rmse:0.658743
## [172] train-rmse:0.651968
## [173] train-rmse:0.645391
## [174] train-rmse:0.638919
## [175] train-rmse:0.632725
## [176] train-rmse:0.626686
## [177] train-rmse:0.620814
## [178] train-rmse:0.615081
## [179] train-rmse:0.609539
## [180] train-rmse:0.604140
## [181] train-rmse:0.598933
## [182] train-rmse:0.593875
## [183] train-rmse:0.588901
## [184] train-rmse:0.584169
## [185] train-rmse:0.579558
## [186] train-rmse:0.575092
## [187] train-rmse:0.570799
## [188] train-rmse:0.566613
## [189] train-rmse:0.562603
## [190] train-rmse:0.558608
## [191] train-rmse:0.554846
## [192] train-rmse:0.551244
## [193] train-rmse:0.547722
## [194] train-rmse:0.544283
## [195] train-rmse:0.540938
## [196] train-rmse:0.537673
## [197] train-rmse:0.534556
## [198] train-rmse:0.531517
## [199] train-rmse:0.528559
## [200] train-rmse:0.525726
## [201] train-rmse:0.522981
## [202] train-rmse:0.520252
## [203] train-rmse:0.517709
## [204] train-rmse:0.515265
## [205] train-rmse:0.512886
## [206] train-rmse:0.510571
## [207] train-rmse:0.508290
## [208] train-rmse:0.506158
## [209] train-rmse:0.504039
## [210] train-rmse:0.501995
## [211] train-rmse:0.500043
## [212] train-rmse:0.498147
## [213] train-rmse:0.496366
## [214] train-rmse:0.494609
## [215] train-rmse:0.492884
## [216] train-rmse:0.491295
## [217] train-rmse:0.489736
## [218] train-rmse:0.488173
## [219] train-rmse:0.486729
## [220] train-rmse:0.485269
## [221] train-rmse:0.483875
## [222] train-rmse:0.482535
## [223] train-rmse:0.481148
## [224] train-rmse:0.479884
## [225] train-rmse:0.478725
## [226] train-rmse:0.477508
## [227] train-rmse:0.476356
## [228] train-rmse:0.475257
## [229] train-rmse:0.474185
## [230] train-rmse:0.473120
## [231] train-rmse:0.472070
## [232] train-rmse:0.471090
## [233] train-rmse:0.470082
## [234] train-rmse:0.469193
## [235] train-rmse:0.468300
## [236] train-rmse:0.467468
## [237] train-rmse:0.466650
## [238] train-rmse:0.465885
## [239] train-rmse:0.465114
## [240] train-rmse:0.464407
## [241] train-rmse:0.463649
## [242] train-rmse:0.462963
## [243] train-rmse:0.462313
## [244] train-rmse:0.461631
## [245] train-rmse:0.460999
## [246] train-rmse:0.460355
## [247] train-rmse:0.459743
## [248] train-rmse:0.459180
## [249] train-rmse:0.458619
## [250] train-rmse:0.458109
## [251] train-rmse:0.457562
## [252] train-rmse:0.457063
## [253] train-rmse:0.456581
## [254] train-rmse:0.456077
## [255] train-rmse:0.455636
## [256] train-rmse:0.455219
## [257] train-rmse:0.454795
## [258] train-rmse:0.454377
## [259] train-rmse:0.453963
## [260] train-rmse:0.453587
## [261] train-rmse:0.453230
## [262] train-rmse:0.452841
## [263] train-rmse:0.452496
## [264] train-rmse:0.452162
## [265] train-rmse:0.451850
## [266] train-rmse:0.451498
## [267] train-rmse:0.451151
## [268] train-rmse:0.450853
## [269] train-rmse:0.450554
## [270] train-rmse:0.450239
## [271] train-rmse:0.449953
## [272] train-rmse:0.449669
## [273] train-rmse:0.449380
## [274] train-rmse:0.449101
## [275] train-rmse:0.448838
## [276] train-rmse:0.448554
## [277] train-rmse:0.448277
## [278] train-rmse:0.448038
## [279] train-rmse:0.447821
## [280] train-rmse:0.447529
## [281] train-rmse:0.447282
## [282] train-rmse:0.447027
## [283] train-rmse:0.446817
## [284] train-rmse:0.446610
## [285] train-rmse:0.446402
## [286] train-rmse:0.446187
## [287] train-rmse:0.445975
## [288] train-rmse:0.445739
## [289] train-rmse:0.445506
## [290] train-rmse:0.445317
## [291] train-rmse:0.445091
## [292] train-rmse:0.444903
## [293] train-rmse:0.444668
## [294] train-rmse:0.444473
## [295] train-rmse:0.444297
## [296] train-rmse:0.444115
## [297] train-rmse:0.443968
## [298] train-rmse:0.443807
## [299] train-rmse:0.443637
## used (Mb) gc trigger (Mb) max used (Mb)
## Ncells 3692714 197.3 6861544 366.5 5831138 311.5
## Vcells 192077515 1465.5 281230277 2145.7 281227208 2145.6